Approximability of Constrained LCS

Author

  • Minghui Jiang
Abstract

Given two input sequences A1 and A2 and one constraint sequence B, C-LCS is the problem of finding a longest common subsequence C of A1 and A2 that is also a supersequence of B. For example:

  A1 = cgagggt
  A2 = cgggagt
  B = cat
  C = cgagt

Constrained Longest Common Subsequence is a natural extension of the classical Longest Common Subsequence problem, and has applications to computing the homology of two biological sequences that have a specific or putative structure in common (Tsai, 2003).

Notations. For a sequence S, let S[i] denote the letter of S at position i, let S[i, j] denote the subsequence of S starting at position i and ending at position j (the subsequence is empty when i > j), and let |S| denote the length of S. For two sequences S and T, let ST denote the concatenation of S and T, and write S ⊑ T if S is a subsequence of T. For a sequence R and a non-negative integer r, let R^r denote the sequence consisting of r repetitions of R concatenated together.

Generalization. The problem C-LCS can be easily generalized to a problem C-LCS(k, l) for an arbitrary number k of input sequences and an arbitrary number l of constraint sequences (Gotthilf et al. [...] Here the input size n is the total length of the k input sequences and the l constraint sequences. For C-LCS(2, 1), the most basic version of the problem C-LCS on two input sequences A1 and A2 and one constraint sequence B, there are dynamic programming algorithms running in [...] for some related results.

The problem C-LCS(k, l) becomes intractable, however, when either the number k of input sequences or the number l of constraint sequences is unbounded. An early result of Middendorf (1995) on consistent sequences of type (Super, Sub) implies that even if the input and constraint sequences are over a binary alphabet, it is already NP-hard to decide whether a given instance of C-LCS(2, l) has a valid solution; see also Middendorf and Manlove (2004). Recently, Gotthilf et al. (2008) showed that if the sequences are over an arbitrary alphabet, then even if all constraint sequences have length 1, it is again NP-hard to decide whether a given instance of C-LCS(2, l) has a valid solution.
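To make the definition of C-LCS(2, 1) concrete, here is a minimal Python sketch of a textbook-style cubic dynamic program over prefixes of A1, A2 and B. It is an illustration under stated assumptions, not necessarily the algorithm the abstract refers to; the function names `constrained_lcs` and `is_subsequence` are hypothetical and introduced only for this example.

```python
from typing import Optional


def is_subsequence(s: str, t: str) -> bool:
    """Return True if s is a subsequence of t."""
    it = iter(t)
    return all(ch in it for ch in s)


def constrained_lcs(a1: str, a2: str, b: str) -> Optional[str]:
    """Return one longest common subsequence of a1 and a2 that is also a
    supersequence of b, or None if no such sequence exists.

    Illustrative sketch only: O(|a1| * |a2| * |b|) time and space.
    """
    NEG = -(10 ** 9)  # sentinel meaning "no valid solution for this state"
    n1, n2, m = len(a1), len(a2), len(b)
    # f[k][i][j] = length of a longest common subsequence of a1[:i] and
    # a2[:j] that contains b[:k] as a subsequence (NEG if none exists).
    f = [[[NEG] * (n2 + 1) for _ in range(n1 + 1)] for _ in range(m + 1)]
    for k in range(m + 1):
        for i in range(n1 + 1):
            for j in range(n2 + 1):
                if i == 0 or j == 0:
                    # An empty prefix can only satisfy an empty constraint.
                    f[k][i][j] = 0 if k == 0 else NEG
                    continue
                best = max(f[k][i - 1][j], f[k][i][j - 1])
                if a1[i - 1] == a2[j - 1]:
                    # Use the matching letter without advancing in b.
                    best = max(best, f[k][i - 1][j - 1] + 1)
                    if k > 0 and a1[i - 1] == b[k - 1]:
                        # Use the matching letter to cover b[k - 1].
                        best = max(best, f[k - 1][i - 1][j - 1] + 1)
                f[k][i][j] = best if best >= 0 else NEG

    if f[m][n1][n2] < 0:
        return None

    # Trace back through the table to recover one optimal sequence.
    out = []
    k, i, j = m, n1, n2
    while i > 0 and j > 0:
        cur = f[k][i][j]
        if a1[i - 1] == a2[j - 1]:
            if (k > 0 and a1[i - 1] == b[k - 1]
                    and f[k - 1][i - 1][j - 1] + 1 == cur):
                out.append(a1[i - 1])
                k, i, j = k - 1, i - 1, j - 1
                continue
            if f[k][i - 1][j - 1] + 1 == cur:
                out.append(a1[i - 1])
                i, j = i - 1, j - 1
                continue
        if f[k][i - 1][j] == cur:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    # The instance from the abstract.
    print(constrained_lcs("cgagggt", "cgggagt", "cat"))  # cgagt
```

On the instance from the abstract, the letter a occurs exactly once in each input, so any valid solution splits into a common subsequence of cg and cggg, the letter a, and a common subsequence of gggt and gt; this forces the unique optimum cgagt of length 5.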

Similar resources

Variants of Constrained Longest Common Subsequence

In this work, we consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1 and s2 over an alphabet Σ, a set Cs of strings, and a function Co : Σ → N , the DC-LCS problem consists in finding the longest subsequence s of s1 and s2 such that s is a supersequence of all the strings in Cs and such that t...

Constrained LCS: Hardness and Approximation

The problem of finding the longest common subsequence (LCS) of two given strings A1 and A2 is a well-studied problem. The constrained longest common subsequence (C-LCS) for three strings A1, A2 and B1 is the longest common subsequence of A1 and A2 that contains B1 as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of O(m1m2n1) where m1, m2 and n1 are the len...

Quadratic-time Algorithm for the String Constrained LCS Problem

The problem of finding a longest common subsequence of two main sequences with some constraint that must be a substring of the result (STR-IC-LCS) was formulated recently. It is a variant of the constrained longest common subsequence problem. As the known algorithms for the STR-IC-LCS problem are cubic-time, the presented quadratic-time algorithm is significantly faster.

Longest common subsequence problem for unoriented and cyclic strings

Given a finite set of strings X, the Longest Common Subsequence problem (LCS) consists in finding a subsequence common to all strings in X that is of maximal length. LCS is a central problem in stringology and finds broad applications in text compression, conception of error-detecting codes, or biological sequence comparison. However, in numerous contexts words represent cyclic or unoriented se...

New efficient algorithms for the LCS and constrained LCS problems

In this paper, we study the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In the CLCS problem, the computed LCS must also be a supersequence of a third given string. In this paper, we first present an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time, where R is the to...


Journal:
  • J. Comput. Syst. Sci.

Volume 78

Publication date: 2010